\documentclass[11pt,a4paper]{article} 
\usepackage{latexsym}
\usepackage{paralist}
\usepackage{amssymb}
\usepackage{url}
\usepackage{graphicx}

\parskip        0mm
\oddsidemargin  0in
\evensidemargin 0in
\textwidth      6in
\topmargin      0in
\marginparwidth 30pt
\textheight     9.2in
\headheight     .1in
\headsep        .1mm 

\title{VU Programm- und Systemverifikation\\ 
\Large Homework: Propositional Logic, Hoare Logic, and Test Case Generation}
\author{~}
\date{{\bf Due date:} May 13, 2021, \underline{1pm}}

\usepackage{listings}
\lstset{
  basicstyle=\footnotesize\ttfamily, % Standardschrift
  numbers=left,               % Ort der Zeilennummern
  numberstyle=\tiny,          % Stil der Zeilennummern
  numbersep=5pt,              % Abstand der Nummern zum Text
  tabsize=2,                  % Groesse von Tabs
  extendedchars=true,         %
  breaklines=true,            % Zeilen werden Umgebrochen
  keywordstyle=\ttfamily,
  stringstyle=\ttfamily, % Farbe der String
  showspaces=false,           % Leerzeichen anzeigen ?
  showtabs=false,             % Tabs anzeigen ?
  xleftmargin=17pt,
  framexleftmargin=17pt,
  framexrightmargin=5pt,
  framexbottommargin=4pt,
  showstringspaces=false      % Leerzeichen in Strings anzeigen ?
}
\lstloadlanguages{ 
  C , C++, ml
}


\gdef\dash---{\thinspace---\hskip.16667em\relax}
\gdef\smdash--{\thinspace--\hskip.16667em\relax}
\newcommand{\emn}[1]{{\em #1\/}}
\gdef\op|{\,|\;}

\newcommand{\true}{{\sc true}}

\begin{document}

\maketitle

\paragraph{Task 1 (6 points):}

Prove the Hoare Triple below (assume that the domain of 
the  variables {\tt t}, {\tt m}, {\tt n} are the integers
($\mathbb{Z}$).
You need to find a sufficiently strong loop invariant.

Annotate the following code directly with the required assertions.
\underline{Justify} each assertion by stating which Hoare rule you used
to derive it, and the premise(s) of that rule. If you strengthen
or weaken conditions, explain your reasoning.

\noindent No points
for assertions  not derived by using one of the rules from the lecture!

\vspace{.5cm}

(Below) in Figure \ref{fig:hoare-1} a proof is outlined with annotated asserations and annotated justifications. The rules used are the five as introduced in the lecture, but abbreviated: assignment rule (ASSIGN), composition rule (COMP), condition rule (COND), consequence rule (CONS) and loop rule (LOOP). The expression $P$ for the assignment rule has been left out, because it should be clear what $P$ is.

This solution is not yet complete: The program is correct only by after applying the composition rule, but because this would have made the figure considerably more ugly, this is supplemented with a textual explanation: We know because of the condition rule that $\{ true \}\ C_1\ \{ m \leq n \}$ holds (where $C_1$ is the entire if statement body) and we know because of the loop rule that $\{ m \leq n \}\ C_2\ \{ m \geq n \land m \leq n \}$ holds (where $C_2$ is the entire loop body). By applying the composition rule on $C_1$ and $C_2$ with $P = true$, $Q = m \leq n$ and $R = m \geq n \land m \leq n$, we know that $\{P\}\ C_3\ \{R\}$ holds (where $C_3$ is the entire program).

\begin{figure}
  \centering
  \includegraphics[scale=.95]{./res/1-hoare.png}
  \caption{Hoare proof}
  \label{fig:hoare-1}
\end{figure}

\clearpage

\paragraph{Task 2 (5 points):}

Prove the Hoare Triple below (assume that the domain of 
all variables except {\tt done} in the program are the unsigned
integers including zero, i.e.,
${\tt i}, {\tt n}\in\mathbb{N}\cup\{0\}$, and that
{\tt done} is a Boolean variable).
You need to find a sufficiently strong loop invariant.\\

Annotate the following code directly with the required assertions.
\underline{Justify} each assertion by stating which Hoare rule you used
to derive it, and the premise(s) of that rule. If you strengthen
or weaken conditions, explain your reasoning.

\vspace{.5cm}

The solution to this task follows the same conventions as in task 1. Figure \ref{fig:hoare-2} outlines a Hoare proof of the program. Like in task 1, we need to supplement an application of the condition rule, but before we do that, a little elaboration is needed on the claims of equivalence throughout the figure.

The first one is trivially true, that is:

$$
n \geq 1 \Leftrightarrow n \geq 1 \land 1 \geq 1
$$

The second one looks tricky, but is also true:

$$
n \geq i, i \geq 1, i\ mod\ n \not \equiv 0
$$

it follows that

$$
i\ mod\ n \not \equiv 0 \land n \geq i \Rightarrow n > i
$$

and

$$
i\ mod\ n \not \equiv 0 \land i \geq 0 \Rightarrow i > 0
$$

so finally

$$
n > i \land i > 0 \Rightarrow n \geq i + 1 \land i \geq 0
$$

The third one must also be true:

$$
n \geq i \land i \geq 1 \land i\ mod\ n \equiv 0
$$

because

$$
i\ mod\ n \equiv 0 \Longrightarrow i = k n, k \in \mathbb{Z}
$$

combined with the fact that $i \geq 1$ and our knowledge that $k = 1$. We know $k = 1$ to be true because of $i \geq 1 \land n \geq i$.

All that is left is to apply the composition rule: $\{P\}\ C_1\ \{Q\}, \{Q\}\ C_2\ \{R\}$ with $C_1 =$ \texttt{i + 1} and $C_2$ being the entire loop body as well as $P = n \geq 1$, $Q = n \geq i \land i \geq 1$ and $R = i = n$. It follows that $\{P\}\ C_1;\ C_2\ \{R\}$

\begin{figure}
  \centering
  \includegraphics[scale=.95]{./res/2-hoare.png}
  \caption{Hoare proof}
  \label{fig:hoare-2}
\end{figure}

\clearpage

\paragraph{Task 3 (2 points):}

  Construct an Ordered Binary Decision Diagram that encodes the following
  Boolean formula:
  \begin{equation}
    (x_1 \Leftrightarrow x_2) \wedge (x_1)
  \end{equation}

  \vspace{.5cm}
  \noindent Ilustrate
  the construction steps and not just the final result! 

\begin{table}[h!]
  \begin{center}
    \caption{truth table for $x_1 \Leftrightarrow x_2$}
    \label{tab:truth-table}
    \begin{tabular}{c|c|c} % <-- Alignments: 1st column left, 2nd middle and 3rd right, with vertical lines in between
      $x_1$ & $x_2$ & $x_1 \Leftrightarrow x_2$ \\
      \hline
      0 & 0 & 1\\
      0 & 1 & 0\\
      1 & 0 & 0\\
      1 & 1 & 1\\
    \end{tabular}
  \end{center}
\end{table}

To solve this task, we begin by displaying two trees representing two subtrees that we want to combine (with their corresponding expressions), just like in the lecture, in Figure \ref{fig:initial-bdds}. One of the subtrees is $x_1 \Leftrightarrow x_2$, which we could have expanded into either of the two normal forms and constructed like we are about to do our two subtrees, but this was omissed because it would be repetitive. Another argument is that the BDD for $x_1 \Leftrightarrow x_2$ is easily constructed by looking at the truth table Table \ref{tab:truth-table}.

For this example, we chose the order of variables to be $x_1, x_2$.

\begin{figure}
  \centering
  \includegraphics[scale=.5]{./res/3-initial-bdds.png}
  \caption{initial subexpression BDDs}
  \label{fig:initial-bdds}
\end{figure}

In this case we are combining two BDDs using the $\land$ operator. To combine the BDDs, we extract the next variable in our variable order, $x_1$, using Shannon expansion (Figure \ref{fig:combined-bdds}). The left subtree ends up as $(0 \Leftrightarrow x_2) \land 0$, which is just $0$. The right subtree ends up as $(1 \Leftrightarrow x_2) \land 1$, which can be simplified as shown in the figure:

$$
(1 \Leftrightarrow x_2) \land 1 = (1 \Rightarrow x_2) \land \overbrace{(x_2 \Rightarrow 1)}^{1} = \neg 1 \lor x_2 = x_2
$$

\begin{figure}
  \centering
  \includegraphics[scale=.5]{./res/3-bdd-transformation.png}
  \caption{combining the BDDs}
  \label{fig:combined-bdds}
\end{figure}

Putting these facts together, we arrive at the BDD in Figure \ref{fig:result-bdd}, which is already the result.

\begin{figure}
  \centering
  \includegraphics[scale=.5]{./res/3-result.png}
  \caption{resulting BDD}
  \label{fig:result-bdd}
\end{figure}

\vfill


\clearpage

\paragraph{Task 4 (2 points):}

Assume that {\tt x} and {\tt y} are bit-vectors of width 2.
Provide an encoding of the right arithmetic shift operation
{\tt x >> y} (i.e., the most significant bit indicating the sign
is replicated to fill in all vacant positions).

\vspace{.5cm}
\noindent Use a similar circuit-based illustration as on the
slides for the lecture on propositional logic.
\begin{figure}
  \centering
  \includegraphics[scale=.6]{./res/4-implementation.png}
  \caption{right arithmetic shift encoding}
  \label{fig:encoding}
\end{figure}

Figure \ref{fig:encoding} describes an encoding for this task. The input and outputs are specified on the upper left. The operation is described in the figure in three different ways, namely graphically (lower left), using predicate logic (upper right) and finally in the form of a circuit that implements said operation (lower right). For clarity, the operation is defined as:

$$
(x_1 x_0)_b >> (y_1 y_0)_b = (z_1 z_0)_b
$$

with:

$$
z_1 = x_1, z_0 = ((y_1 \land y_0) \land x_1) \lor ((\neg y_1 \land \neg y_0) \land x_0)
$$

In particular, $z_1$ is always $z_0$, even if no shift is to be applied. $z_1$ turns out as $x_1$ iff $(y)_d$ interpreted as an unsigned decimal number is bigger than $0$. Even though $(y)_d$ can take on several values above $0$, applying the right arithmetic shift operation multiple times or just once won't change the outcome, because the MSB $x_1$ is already copied to both $z_1$ and $z_0$ after one shift.

The circuit is realised as a MUX that chooses $z$ as $(x_1 x_1)_b$ if either $y_1$ or $y_0$ are $1$ and $(x_1 x_0)_b$ otherwise.

\clearpage

\paragraph{Task 5 (5 points):}
Use the \textsc{Klee} symbolic simulator (using the Docker image
from \url{klee.github.io} as explained in the lecture) to test
the following implementation of Euclid's algorithm:

\noindent
\begin{lstlisting}
unsigned gcd (unsigned x, unsigned y)
{
  unsigned m, k;
  if (x > y) {
    k = x;
    m = y;
  }
  else {
    k = y;
    m = x;
  }

  while (m != 0) {
    unsigned r = k % m;
    k = m; m = r;
  }
  return k;
}
\end{lstlisting}

Use \textsc{Klee} to generate test inputs from the following
\emph{specification}:

\noindent
\begin{lstlisting}
#define MIN(x, y) ((x)<(y))?(x):(y)
#define MAX(x, y) ((x)<(y))?(y):(x)
#define IS_CD(r, x, y) (((x)%(r)==0)&&((y)%(r)==0))

unsigned gcd (unsigned x, unsigned y)
{
  for (unsigned t = MIN (x,y); t>0; t--) {
    if (IS_CD(t, x, y))
      return t;
  }
  return MAX(x, y);
}
\end{lstlisting}

(The source code of both implementations can be downloaded from TISS.)

\begin{itemize}
\item How many test cases are required \emph{at least} to 
  achieve branch coverage for the implementation?
\item Provide a \emph{minimal} number of test cases generated
  with {\sc Klee} such that branch coverage for the implementation
  is achieved!
  \begin{center}
    \begin{tabular}{|c|c||c|}
      \hline
      {\tt x} & {\tt y} & {\tt gcd(x,y)} \\
      \hline
      0 & 0 & 0\\
      4294967295 & 1 & 1\\
      \hline
    \end{tabular}
  \end{center}
  \vfill
\end{itemize}

\clearpage
\begin{itemize}
\item If a given test suite achieves branch coverage
  for the specification, does the same test suite also
  achieve branch coverage for the implementation \ldots
  \begin{compactitem}
  \item for this specific example?
  \item in general?
  \end{compactitem}
  For both cases, explain why!
\end{itemize}

{\bf Hint:} To replay the test cases, you need to make sure that
the environment variable {\tt LD\_LIBRARY\_PATH} points to the
directory {\tt /home/klee/klee\_build/klee/lib} containing
the library {\tt libkleeRuntest.so.1.0}!

\vspace{.5cm}

To answer the first question, the implementation in question has two unnested branches caused by the \texttt{if} at line 4 and \texttt{while} at line 13 respectively. Because these branches are unrelated to each other, we only need at least \textit{two} test cases to achieve branch coverage (one test case where we take the first branch but not the second, and one test case where we take the second but not the first).

In the following section we describe our methodology to find (and prove) that two test cases are enough to achieve branch coverage. \texttt{impl.c} refers to the implementation, while \texttt{spec.c} refers to the specification. The main functions of both these programs were written such that unsigned integers with the same variable name are made symbolic for test generation, such that the test cases can be replayed on either version. First we must generate our testcases:

\noindent
\begin{lstlisting}
cd /home/klee/gcd
clang -I ../klee_src/include -emit-llvm -c -g impl.c -o impl.out
clang -I ../klee_src/include -emit-llvm -c -g spec.c -o spec.out
klee spec.bc
\end{lstlisting}

In our case, we waited until 30 test cases were generated. Now we can replay all of these against both programs:

\noindent
\begin{lstlisting}
for i in klee-last/*ktest; do
    KTEST_FILE=$i ./impl.out;
    echo "testing $i against impl: $?";
    KTEST_FILE=$i ./spec.out;
    echo "testing $i against spec: $?";
done
\end{lstlisting}

The programs come to the same conclusion for all the test cases. Of course, this does not conclusively show that our implementation is correct, but it's a start. Next up, we want to check the coverage. For that, we newly compile both our programs:

\noindent
\begin{lstlisting}
clang --coverage -I ../klee_src/include/ -L ../klee_build/lib/ impl.c -lkleeRuntest -o impl.out
clang --coverage -I ../klee_src/include/ -L ../klee_build/lib/ spec.c -lkleeRuntest -o spec.out
\end{lstlisting}

The first thing we are interested in is achieving branch coverage for our \textit{implementation}, minding that we set out to use the minimal amount of test cases, which we argued to be two earlier. We can devise a short shell gimmick to do our work for us:

\vfill

\noindent
\begin{lstlisting}
cp impl.gcno impl.gcno.orig
(for i in {0..2}; do
    for j in {1..9}; do
        for k in {0..2}; do
            for l in {0..9}; do
                echo "tests $i$j and $k$l:";
                rm impl.gcda impl.c.gcov;
                cp impl.gcno.orig impl.gcno;
                KTEST_FILE=klee-last/test0000$i$j.ktest ./impl.out;
                KTEST_FILE=klee-last/test0000$k$l.ktest ./impl.out;
                llvm-cov gcov impl.gcno;
            done;
        done;
    done;
done | grep 100 -B 2) &> impl.c.coverage
\end{lstlisting}

Note: yes, I only noticed afterwards that the second for loop goes from 1 to 9, I've decided to let it be

We end up with a file \texttt{impl.c.coverage} that has content in the following format (showing only the first five results):

\noindent
\begin{lstlisting}
tests 01 and 04:
File 'impl.c'
Lines executed:100.00% of 16
--
tests 01 and 06:
File 'impl.c'
Lines executed:100.00% of 16
--
tests 01 and 08:
File 'impl.c'
Lines executed:100.00% of 16
--
tests 01 and 10:
File 'impl.c'
Lines executed:100.00% of 16
--
tests 01 and 12:
File 'impl.c'
Lines executed:100.00% of 16
...
\end{lstlisting}

We can also easily check now whether any test cases - and whether the same test cases - achieve full branch coverage for the \textit{specification} instead of the implementation (albeit this might take much longer). Incidentally, the test cases chosen for the table above uses tests 01 and 04 that were found using this way. Cherrypicking our previous examples again yields us:

\noindent
\begin{lstlisting}
tests 01 and 04:
File 'spec.c'
Lines executed:90.91% of 11
---
tests 01 and 06:
File 'spec.c'
Lines executed:100.00% of 11
---
tests 01 and 08:
File 'spec.c'
Lines executed:100.00% of 11
---
tests 01 and 10:
File 'spec.c'
Lines executed:100.00% of 11
---
tests 01 and 12:
File 'spec.c'
Lines executed:100.00% of 11
...
\end{lstlisting}

We can answer with confidence now: test suites that yield full branch coverage for our implementation do not necessarily achieve full coverage for our specification, although we know that two more appropriately chosen test cases are sufficient. So it cannot be the case that this applies to any two programs in general. But the original question was backwards: whether full branch coverage for our specification implies full branch coverage for our implementation. To compare this, we just look at some more test suites that achieve full coverage for the specification:

\noindent
\begin{lstlisting}
tests 01 and 05:
File 'spec.c'
Lines executed:100.00% of 11
--
tests 01 and 06:
File 'spec.c'
Lines executed:100.00% of 11
--
tests 01 and 07:
File 'spec.c'
Lines executed:100.00% of 11
--
tests 01 and 08:
File 'spec.c'
Lines executed:100.00% of 11
---
tests 01 and 09:
File 'spec.c'
Lines executed:100.00% of 11
...
\end{lstlisting}

These are the first five that pop up. We can already see three test suites that did not show for our implementation, namely \texttt{tests 01 and 05}, \texttt{tests 01 and 07} and \texttt{tests 01 and 09} (the coverage for these five test suites for our implementation will follow below). As such, we can also answer the original question, full branch coverage for our specification does not imply full branch coverage for our implementation.

\noindent
\begin{lstlisting}
tests 01 and 05:
File 'impl.c'
Lines executed:81.25% of 16
impl.c:creating 'impl.c.gcov'
---
tests 01 and 06:
File 'impl.c'
Lines executed:100.00% of 16
impl.c:creating 'impl.c.gcov'
---
tests 01 and 07:
File 'impl.c'
Lines executed:81.25% of 16
impl.c:creating 'impl.c.gcov'
---
tests 01 and 08:
File 'impl.c'
Lines executed:100.00% of 16
impl.c:creating 'impl.c.gcov'
---
tests 01 and 09:
File 'impl.c'
Lines executed:81.25% of 16
impl.c:creating 'impl.c.gcov'
...
\end{lstlisting}

\vfill
Note: I got a little caught up in describing how I use the tools, I hope the explanation was insightful and not too long.

Upload a pdf file with your solutions to TUWEL by May 13, 2021.

\end{document}
